Anytime planning for agent behaviour
For an agent to act successfully in a complex and dynamic environment (such as a computer game), it must have a method of generating future behaviour that meets the demands of its environment. One such method is anytime planning. This paper discusses the problems and benefits associated with making a planning system work under the anytime paradigm, and introduces Anytime-UMCP (A-UMCP), an anytime version of the UMCP hierarchical task network (HTN) planner [Erol, 1995]. It also covers the abilities an agent must have in order to execute plans produced by an anytime hierarchical task network planner.
Exploring Design Space For An Integrated Intelligent System
Understanding the trade-offs available in the design space of intelligent systems is a major unaddressed element in the study of Artificial Intelligence. In this paper we approach this problem in two ways. First, we discuss the development of our integrated robotic system in terms of its trajectory through design space. Second, we demonstrate the practical implications of architectural design decisions by using this system as an experimental platform for comparing behaviourally similar yet architecturally different systems. The results of this comparison show that our system occupies a "sweet spot" in design space in terms of the cost of moving information between processing components.
Active Inference for Integrated State-Estimation, Control, and Learning
This work presents an approach for control, state-estimation and learning
model (hyper)parameters for robotic manipulators. It is based on the active
inference framework, prominent in computational neuroscience as a theory of the
brain, where behaviour arises from minimizing variational free-energy. The
robotic manipulator shows adaptive and robust behaviour compared to
state-of-the-art methods. Additionally, we show the exact relationship to
classic methods such as PID control. Finally, we show that by learning a
temporal parameter and model variances, our approach can deal with unmodelled
dynamics, damp oscillations, and remain robust against disturbances and poor
initial parameters. The approach is validated on the 'Franka Emika Panda' 7 DoF
manipulator.
Comment: 7 pages, 6 figures, accepted for presentation at the International
Conference on Robotics and Automation (ICRA) 202
Effects of Training Data Variation and Temporal Representation in a QSR-Based Action Prediction System
Understanding behaviour is a crucial skill for Artificial Intelligence systems expected to interact with external agents, whether other AI systems or humans, in scenarios involving co-operation, such as domestic robots capable of helping out with household jobs, or disaster relief robots expected to collaborate and lend assistance to others. It is useful for such systems to be able to quickly learn and re-use models and skills in new situations. Our work centres around a behaviour-learning system utilising Qualitative Spatial Relations (QSRs) to lessen the amount of training data required by the system, and to aid generalisation. In this paper, we provide an analysis of the advantages provided to our system by the use of QSRs. We provide a comparison of a variety of machine learning techniques utilising both quantitative and qualitative representations, and show the effects of varying amounts of training data and temporal representations upon the system. The subject of our work is the game of simulated RoboCup Soccer Keepaway. Our results show that employing QSRs provides clear advantages in scenarios where training data is limited, and provides for better generalisation performance in classifiers. In addition, we show that adopting a qualitative representation of time can provide significant performance gains for QSR systems.
Bootstrapping Probabilistic Models of Qualitative Spatial Relations for Active Visual Object Search
In many real-world applications, autonomous mobile robots are required to observe or retrieve objects in their environment, despite not having accurate estimates of the objects' locations. Finding objects in real-world settings is a non-trivial task, given the complexity and the dynamics of human environments. However, by understanding and exploiting the structure of such environments, e.g. where objects are commonly placed as part of everyday activities, robots can perform search tasks more efficiently and effectively than without such knowledge. In this paper we investigate how probabilistic models of qualitative spatial relations can improve performance in object search tasks. Specifically, we learn Gaussian Mixture Models of spatial relations between object classes from descriptive statistics of real office environments. Experimental results with a range of sensor models suggest that our model improves overall performance in object search tasks.
Learning Deep Visual Object Models From Noisy Web Data: How to Make it Work
Deep networks thrive when trained on large scale data collections. This has
given ImageNet a central role in the development of deep architectures for
visual object classification. However, ImageNet was created during a specific
period in time, and as such it is prone to aging, as well as dataset bias
issues. Moving beyond fixed training datasets will lead to more robust visual
systems, especially when deployed on robots in new environments which must
train on the objects they encounter there. To make this possible, it is
important to break free from the need for manual annotators. Recent work has
begun to investigate how to use the massive amount of images available on the
Web in place of manual image annotations. We contribute to this research thread
with two findings: (1) a study correlating a given level of label noise to
the expected drop in accuracy, for two deep architectures, on two different
types of noise, that clearly identifies GoogLeNet as a suitable architecture
for learning from Web data; (2) a recipe for the creation of Web datasets with
minimal noise and maximum visual variability, based on a visual and natural
language processing concept expansion strategy. By combining these two results,
we obtain a method for learning powerful deep object models automatically from
the Web. We confirm the effectiveness of our approach through object
categorization experiments using our Web-derived version of ImageNet on a
popular robot vision benchmark database, and on a lifelong object discovery
task on a mobile robot.
Comment: 8 pages, 7 figures, 3 table
Home alone: autonomous extension and correction of spatial representations
In this paper we present an account
of the problems faced by a mobile robot given
an incomplete tour of an unknown environment,
and introduce a collection of techniques which can
generate successful behaviour even in the presence
of such problems. Underlying our approach is the
principle that an autonomous system must be motivated
to act to gather new knowledge, and to validate
and correct existing knowledge. This principle is
embodied in Dora, a mobile robot which features
the aforementioned techniques: shared representations,
non-monotonic reasoning, and goal generation
and management. To demonstrate how well this
collection of techniques works in real-world situations
we present a comprehensive analysis of the Dora
system’s performance over multiple tours in an indoor
environment. In this analysis Dora successfully
completed 18 of 21 attempted runs, with all but
3 of these successes requiring one or more of the
integrated techniques to recover from problems.
One Risk to Rule Them All: Addressing Distributional Shift in Offline Reinforcement Learning via Risk-Aversion
Offline reinforcement learning (RL) is suitable for safety-critical domains
where online exploration is not feasible. In such domains, decision-making
should take into consideration the risk of catastrophic outcomes. In other
words, decision-making should be risk-averse. An additional challenge of
offline RL is avoiding distributional shift, i.e. ensuring that state-action
pairs visited by the policy remain near those in the dataset. Previous works on
risk in offline RL combine offline RL techniques (to avoid distributional
shift), with risk-sensitive RL algorithms (to achieve risk-aversion). In this
work, we propose risk-aversion as a mechanism to jointly address both of these
issues. We propose a model-based approach, and use an ensemble of models to
estimate epistemic uncertainty, in addition to aleatoric uncertainty. We train
a policy that is risk-averse, and avoids high uncertainty actions.
Risk-aversion to epistemic uncertainty prevents distributional shift, as areas
not covered by the dataset have high epistemic uncertainty. Risk-aversion to
aleatoric uncertainty discourages actions that are inherently risky due to
environment stochasticity. Thus, by only introducing risk-aversion, we avoid
distributional shift in addition to achieving risk-aversion to aleatoric risk.
Our algorithm, 1R2R, achieves strong performance on deterministic benchmarks,
and outperforms existing approaches for risk-sensitive objectives in stochastic
domains.
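The distinction this abstract draws between epistemic and aleatoric uncertainty, and the idea of preferring actions with a better worst case, can be illustrated with a minimal Python sketch. This is not the 1R2R algorithm: the function names, the Gaussian-ensemble variance decomposition, and the use of CVaR as the risk measure are all assumptions chosen for illustration.

```python
from statistics import mean, pvariance


def decompose_uncertainty(means, variances):
    """Split the predictive variance of an ensemble of Gaussian models.

    Aleatoric: average noise variance predicted by the ensemble members.
    Epistemic: disagreement between members (variance of their means).
    """
    aleatoric = mean(variances)
    epistemic = pvariance(means)
    return aleatoric, epistemic


def cvar(samples, alpha):
    """Mean of the worst alpha-fraction of sampled returns (at least one sample)."""
    ordered = sorted(samples)
    k = max(1, int(len(ordered) * alpha))
    return mean(ordered[:k])


def risk_averse_choice(returns_by_action, alpha=0.2):
    """Pick the action whose sampled returns have the highest CVaR."""
    return max(returns_by_action, key=lambda a: cvar(returns_by_action[a], alpha))


if __name__ == "__main__":
    # Hypothetical sampled returns for two actions.
    samples = {"safe": [4, 4, 4, 4, 4], "risky": [-10, 8, 8, 8, 8]}
    print(risk_averse_choice(samples))
```

In this toy example the "risky" action has the higher mean return, but its worst-case tail is far lower, so a CVaR-maximising choice selects "safe".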
Convex Hull Monte-Carlo Tree Search
This work investigates Monte-Carlo planning for agents in stochastic
environments, with multiple objectives. We propose the Convex Hull Monte-Carlo
Tree-Search (CHMCTS) framework, which builds upon Trial Based Heuristic Tree
Search and Convex Hull Value Iteration (CHVI), as a solution to multi-objective
planning in large environments. Moreover, we consider how to pose the problem
of approximating multi-objective planning solutions as a contextual multi-armed
bandit problem, giving a principled motivation for how to select actions from
the view of contextual regret. This leads us to the use of Contextual Zooming
for action selection, yielding Zooming CHMCTS. We evaluate our algorithm using
the Generalised Deep Sea Treasure environment, demonstrating that Zooming
CHMCTS can achieve sublinear contextual regret and scales better than CHVI on
a given computational budget.
Comment: Camera-ready version of paper accepted to ICAPS 2020, along with
relevant appendice
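As a rough illustration of the convex-hull idea behind multi-objective planners of this kind, the Python sketch below keeps only those two-objective value vectors that are optimal for at least one linear weighting of the objectives. It is not the CHMCTS or CHVI implementation; the function name and the restriction to two objectives are assumptions made for brevity.

```python
def convex_coverage_set(points):
    """Return the (x, y) value vectors that maximise w*x + (1-w)*y
    for at least one weight w in [0, 1].

    For each candidate p, intersect the intervals of w on which p is at
    least as good as every other vector q; p is kept if the intersection
    is non-empty.
    """
    kept = []
    for p in points:
        lo, hi = 0.0, 1.0
        feasible = True
        for q in points:
            if q == p:
                continue
            # p beats q at weight w when a + b*w >= 0.
            a = p[1] - q[1]
            b = (p[0] - p[1]) - (q[0] - q[1])
            if b > 0:
                lo = max(lo, -a / b)
            elif b < 0:
                hi = min(hi, -a / b)
            elif a < 0:
                # q is better than p for every weight.
                feasible = False
                break
        if feasible and lo <= hi:
            kept.append(p)
    return kept


if __name__ == "__main__":
    # Hypothetical value vectors; (2, 8.5) is Pareto-optimal but lies
    # below the hull, and (4, 4) is dominated by (6, 6).
    values = [(0, 10), (10, 0), (6, 6), (2, 8.5), (4, 4)]
    print(convex_coverage_set(values))
```

The example shows why the hull can be smaller than the Pareto front: a vector may be undominated yet never preferred under any linear trade-off between the objectives.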